Dynamic Indexability: The Query-Update Tradeoff for One-Dimensional Range Queries
Author
Abstract
The B-tree is a fundamental secondary index structure that is widely used for answering one-dimensional range reporting queries. Given a set of $N$ keys, a range query can be answered in $O(\log_B \frac{N}{M} + \frac{K}{B})$ I/Os, where $B$ is the disk block size, $K$ the output size, and $M$ the size of the main memory buffer. When keys are inserted or deleted, the B-tree is updated in $O(\log_B N)$ I/Os if we require the resulting changes to be committed to disk right away. Otherwise, the memory buffer can be used to hold recent updates, and changes can be written to disk in batches, which significantly lowers the amortized update cost. A systematic way of batching updates is to use the logarithmic method, combined with fractional cascading, resulting in a dynamic B-tree that supports insertions in $O(\frac{1}{B} \log \frac{N}{M})$ I/Os and queries in $O(\log \frac{N}{M} + \frac{K}{B})$ I/Os. Such bounds have also been matched by several known dynamic B-tree variants in the database literature. Note, however, that the query cost of these dynamic B-trees is worse than the $O(\log_B \frac{N}{M} + \frac{K}{B})$ bound of the static B-tree by a factor of $\Theta(\log B)$. In this paper, we prove that for any dynamic one-dimensional range query index structure with query cost $O(q + \frac{K}{B})$ and amortized insertion cost $O(u/B)$, the tradeoff $q \cdot \log(u/q) = \Omega(\log B)$ must hold if $q = O(\log B)$. For most reasonable values of the parameters, we have $\frac{N}{M} = B^{O(1)}$, in which case our query-insertion tradeoff implies that the bounds mentioned above are already optimal. We also prove a lower bound of $u \cdot \log q = \Omega(\log B)$, which is relevant for larger values of $q$. Our lower bounds hold in a dynamic version of the indexability model, which is of independent interest. Dynamic indexability is a clean yet powerful model for studying dynamic indexing problems, and can potentially lead to more interesting complexity results.
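The batching idea behind the logarithmic method mentioned in the abstract can be illustrated with a small in-memory sketch. This is not the paper's construction: it omits fractional cascading and performs no real disk I/O, and the class name `LogMethodIndex` and the parameter `buffer_capacity` (standing in for the memory buffer size) are hypothetical names chosen for illustration. It only shows how inserts accumulate in a buffer and are merged into sorted runs of geometrically growing size, so that a range query probes one run per level.

```python
import bisect
import heapq


class LogMethodIndex:
    """Illustrative in-memory sketch of the logarithmic method.

    Assumptions: no fractional cascading, no disk blocks, inserts only.
    """

    def __init__(self, buffer_capacity=4):
        self.buffer_capacity = buffer_capacity  # stands in for the memory buffer
        self.buffer = []   # unsorted recent inserts
        self.levels = []   # levels[i] is [] or a sorted run of buffer_capacity * 2**i keys

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) < self.buffer_capacity:
            return
        # Buffer is full: flush it as a sorted run and carry it upward,
        # merging with each occupied level, like incrementing a binary counter.
        run = sorted(self.buffer)
        self.buffer = []
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append([])
            if not self.levels[i]:
                self.levels[i] = run
                return
            run = list(heapq.merge(self.levels[i], run))
            self.levels[i] = []
            i += 1

    def range_query(self, lo, hi):
        """Return all keys k with lo <= k <= hi, in sorted order."""
        out = [k for k in self.buffer if lo <= k <= hi]
        for run in self.levels:
            # One binary search per level, then a contiguous scan of the matches.
            left = bisect.bisect_left(run, lo)
            right = bisect.bisect_right(run, hi)
            out.extend(run[left:right])
        return sorted(out)
```

Each key takes part in at most one merge per level, which is where the logarithmic amortized insertion cost comes from, while a query has to probe every non-empty level separately. That per-level probing is the query overhead that the tradeoff in the abstract is concerned with.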
Similar resources
Towards a Theory of Indexability (
We consider the problem of indexing general database workloads (combinations of data sets and sets of potential queries). We identify two measures of efficiency of an indexing scheme for a workload: storage redundancy (how many times each item in the data set is stored), and disk access ratio (how many times more blocks than necessary does a query retrieve). We show interesting upper and lower b...
Relational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they require too much memory and processing to be practical, so random and evolutionary methods must be used instead. The use of evolutionary methods, beca...
One-dimensional range searching. Two-dimensional range-searching
A dynamic algorithm is characterized by an "on-line" sequence of operations on a data structure, the operations being queries and updates. Each operation may be performed before the next operation is known. For example, on a directed graph, updates and queries can be intermixed. A query might be: is there a directed path between two nodes of the ...
Dynamic Planar Range Maxima Queries
We consider the dynamic two-dimensional maxima query problem. Let P be a set of n points in the plane. A point is maximal if it is not dominated by any other point in P . We describe two data structures that support the reporting of the t maximal points that dominate a given query point, and allow for insertions and deletions of points in P . In the pointer machine model we present a linear spa...
External Memory Three-Sided Range Reporting and Top-k Queries with Sublogarithmic Updates
An external memory data structure is presented for maintaining a dynamic set of N two-dimensional points under the insertion and deletion of points, and supporting unsorted 3-sided range reporting queries and top-k queries, where top-k queries report the k points with highest y-value within a given x-range. For any constant 0 < ε ≤ 1/2, a data structure is constructed that supports updates in a...
Journal title:
- CoRR
Volume: abs/0811.4346, Issue: -
Pages: -
Publication date: 2008